cyclical stochastic gradient mcmc
Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
Zhang, Ruqi, Li, Chunyuan, Zhang, Jianyi, Chen, Changyou, Wilson, Andrew Gordon
The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We prove that our proposed learning rate schedule provides faster convergence to samples from a stationary distribution than SG-MCMC with standard decaying schedules. Moreover, we provide extensive experimental results to demonstrate the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
- North America > United States > New York (0.04)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)